Genomic insights into local adaptation and vulnerability of Quercus longinux to climate change

Background Climate change is expected to alter the factors that drive changes in adaptive variation. This is especially true for species with long life spans and limited dispersal capabilities. Rapid climate changes may disrupt the migration of beneficial genetic variations, making it challenging for them to keep up with changing environments. Understanding adaptive genetic variations in tree species is crucial for conservation and effective forest management. Our study used landscape genomic analyses and phenotypic traits from a thorough sampling across the entire range of Quercus longinux, an oak species native to Taiwan, to investigate the signals of adaptation within this species. Results Using ecological data, phenotypic traits, and 1,933 single-nucleotide polymorphisms (SNPs) from 205 individuals, we classified three genetic groups, which were also phenotypically and ecologically divergent. Thirty-five genes related to drought and freeze resistance displayed signatures of natural selection. The adaptive variation was driven by diverse environmental pressures such as low spring precipitation, low annual temperature, and soil grid sizes. Using linear-regression-based methods, we identified isolation by environment (IBE) as the optimal model for adaptive SNPs. Redundancy analysis (RDA) further revealed a substantial joint influence of demography, geology, and environments, suggesting a covariation between environmental gradients and colonization history. Lastly, we utilized adaptive signals to estimate the genetic offset for each individual under diverse climate change scenarios. The required genetic changes and migration distance are larger in severe climates. Our prediction also reveals potential threats to edge populations in northern and southeastern Taiwan due to escalating temperatures and precipitation reallocation. Conclusions We demonstrate the intricate influence of ecological heterogeneity on genetic and phenotypic adaptation of an oak species. The adaptation is also driven by some rarely studied environmental factors, including wind speed and soil features. Furthermore, the genetic offset analysis predicted that the edge populations of Q. longinux in lower elevations might face higher risks of local extinctions under climate change. Supplementary Information The online version contains supplementary material available at 10.1186/s12870-024-04942-8.


Introduction
Trees profoundly influence global carbon cycles and ecosystem stabilization [1][2][3].The effects of anthropogenic climate change on precipitation and temperature patterns have altered the distribution, community composition, and phenology of forest trees worldwide [4][5][6][7].However, the impacts of climate change on focal species vary [8][9][10][11][12].For example, warming temperatures may have a positive effect on the growth rate of trees from colder environments but a negative effect on trees from tropical regions [10,13].Higher temperatures and increased aridity are predicted to increase stress reactions in forest trees under drought conditions, especially tropical and subtropical tree species [14][15][16][17].
Studies of the potential response to climate change based on field experiments have emphasized the importance of phenotypic variation for adaptation to local climate and acclimation to drought and cold stresses [18,19].However, field experiments are unrealistic for species with long lifespans, such as trees, or endangered species with limited populations.An alternative approach is genotype-environment association (GEA) analysis, which identifies environmental factors that select for genetic characteristics [20,21].By predicting the vulnerability of forest trees to climate change and environmental factors that may restrict future distribution, GEA analyses can provide a foundation for conservation projects and forest management [22][23][24].
Although GEA studies have implied that standing genetic variation may enable some tree species to cope with climate change, the long lifespans and low germination rates of tree species may limit the pace of adaptation to acute and drastic environmental changes [25].Moreover, the applicability of the results of GEA studies to conservation projects may be confounded by several factors.For example, the geographical distance between source and sink populations and the composition of natural barriers to gene flow may influence pollen dispersal direction and germination rates for introgression between populations [26].These factors could reduce the spread of beneficial alleles (i.e., genetic rescue effects) [27].Incorporating landscape analyses into GEA studies could improve the evaluation of genetic vulnerability, assessment of vulnerable populations, and inference of potential dispersal routes under climate change [28,29].
The tree family Fagaceae is ecologically and economically important and has been widely studied in landscape genetics research [30,31].The varied natural habitats of Fagaceae species offer ideal study systems for exploring the effects of environment on genetic diversity, local adaptation, and the response to climate change [32,33].Studies have shown that environmental factors, such as precipitation and temperature, significantly affect the adaptive divergence of Fagaceae species [34], but the impacts of other abiotic factors, such as wind, topology, and soil, have largely been ignored [35,36].Field experiments have shown that wind and soil factors are prominent drivers of local adaptation of plant morphology and physiology [37,38], but the potential impacts of these factors on genomic architecture and phenotypic variation across the heterogeneous landscape of Fagaceae species have not been comprehensively established [19,37,39,40].This is particularly true for Fagaceae species on the subtropical island of Taiwan, where the mountainous terrain and diverse climate have created a range of habitats [41][42][43].
The rapid development of novel analytic methods in landscape genomics provides unprecedented opportunities to examine new hypotheses and assess vulnerable populations using different statistical assumptions in the context of climate change [2].If climate change disrupts the allele frequencies that underlie current geneticsenvironment relationships, vulnerable populations may become less resilient or even extinct locally [44][45][46][47].In this study, we adopted Quercus longinux Hayata, an endemic Fagaceae species in Taiwan, as the study species to evaluate the complex effects of environment on local adaptation and the response to climate change.Q. longinux inhabits mountainous ranges across Taiwan, from low to middle elevations (500 m to 2,200 m above sea level) [48].Based on its wide distribution along latitudinal and altitudinal gradients, Q. longinux is classified into three varieties (i.e., var.longinux, var.kuoi, and var.lativiolaciifolia), and large morphological variations in fruits and leaves between habitats have been observed [48].Q. longinux var.longinux and var.lativiolaciifolia grow sympatrically at low to middle elevations in Taiwan.By contrast, Q. longinux var.kuoi is allopatric with the other varieties and is limited to southeastern Taiwan.Compared with the other two varieties, Q. longinux var.kuoi has longer, broader leaves that are green on both sides when fresh and have a non-violet abaxial surface when dried [48].Environmental variations may influence several leaf traits associated with photosynthetic efficiency and acclimation to abiotic stresses [49][50][51].Identifying variations in not only genetic data but also leaf morphology that are associated with environment could shed light on the processes that gave rise to the unique Q. longinux var.kuoi in southern Taiwan.
The ecological function of oaks in subtropical forests is essential, and it will be further important to evaluate the adaptation and their vulnerability when climate change is believed to alter their distribution and deteriorate their survival.The first step to conservation is revealing their population structure and adaptation pattern and identifying potential genetic sources and vulnerable populations.Therefore, this work was guided by three objectives.First, we aimed to infer genetic structure and assess the consistency of relationships between genetic data, ecological niches, and phenotypic traits for Q. longinux var.kuoi, a special allopatric variety limited to southern Taiwan.Second, we aimed to identify environmental features that affect spatial genetic variation and adaptive genetic divergence.Third, we aimed to use adaptive genetic variants to evaluate the vulnerability of populations to climate change.

Sampling, sequencing, read mapping, and variant calling
All samples analyzed in this study were collected and identified by the authors.We collected 26 populations spanning all known distributional regions of Q. longinux in Taiwan (Fig. S1a; Table S1).Three populations in Southern Taiwan (SK, LZ, and GS) were morphologically identified as Q. longinux var.kuoi; three populations (TP, NS, JSY) were collected near the locations with documented Q. longinux var.lativiolaciifolia [48]; and the rest populations were identified as Q. longinux var.longinux.From each selected tree, a branch with more than ten mature leaves was collected for morphological measurements.Fresh leaves were stored in silica gel at 4 °C until DNA extraction.Voucher specimens of this study have been deposited at National Taiwan Normal University Herbarium (TNU) under deposition number TNU057201-TNU057214.
DNA was extracted using the modified CTAB method based on Doyle [52].DNA quality and quantity were evaluated with a NanoDrop 2000 Spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, USA) and Qubit 2.0 Fluorometer (Life Technologies, Carlsbad, CA, United States), respectively.After quality control, DNA from 205 individuals was used for dd-RAD library construction.The library was double digested by Sbf1 and Msp1 and ligated to Illumina sequencing adapters with individual barcodes and library indices.Fragments of 250-500 bp were selected and amplified by polymerase chain reaction (PCR).Finally, the fragments were sequenced using Illumina HiSeq X (Illumina, USA) to generate 150-bp paired-end reads, at the Technology Commons in College of Life Science (National Taiwan University, Taiwan).
We trimmed the raw reads to remove adapters, reads < 50 bp, and reads with quality < 30 using fastp [53].The clean reads were mapped to the reference genome of Q. robur [54] using the mem algorithm of BWA [55].Aligned reads with a mapping quality score < 20 were discarded.PCR duplicates were marked with the Picard Toolkit (http://broadinstitute.github.io/picard/).We further performed realignment around indels using ABRA2 [56].Summary statistics, including read number, mapping rate, and coverage, were calculated from SAMtools [57].Variants were called with BCFtools [58].We first generated genotype likelihoods at each genomic position with coverage using the BCFtools mpileup command and called the variants with default parameters.After removing indels, we retained biallelic SNPs with missing rates < 0.4, minor allele frequencies > 0.01, genotypes with quality scores > 30, and mean minimum depths > 3 as filtering thresholds in VCFtools [59].

Genetic diversity, genetic structure, and demographic analysis
To evaluate and visualize the genetic clusters of Q. longinux, we first performed principal component analysis (PCA) using a genetic matrix in which missing data were replaced with mean values of each site in the R package adegenet [60].Next, pairwise F ST , populationspecific F ST , and permutational multivariate analysis of variance (PERMANOVA) were performed with the R packages hierfstat [61] and vegan [62] to evaluate differentiation among defined genetic clusters.A neighborjoining (NJ) tree was also constructed based on the Nei distances from the R package ape [63].We then employed StrAuto [64] analysis to assess the pattern of admixture among populations using 100,000 steps of MCMC chains with 25,000 steps burn-in and 10 iterations from K = 1 to 8. The results were uploaded to the CLUMPAK [65] server to generate consensus plots and evaluate the best K according to the ΔK value.
A stairway plot was used to infer historical changes in the effective population size (Ne) of Q. longinux [66].We first generated the site frequency spectra (SFSs) from easySFS (GitHub repo: https://github.com/isaacovercast/easySFS)using all SNPs and selected the projection values with the highest segregating sites for each group.We then ran bootstrapping 1,000 times to estimate median sizes and 95% CIs, assuming a mutation rate of 1.01 × 10 − 8 per bp per generation and a generation time of 50 years for oak species [67].
To provide an in-depth investigation of the evolution history between Q. longinux var.kuoi and the rest of the populations, we simulated and compared four scenarios (Fig. S1b) with different patterns of migration using fast-simcoal2 [68].All models were simulated and optimized using the composite likelihoods calculated with SFSs.Because the computation of fastsimcoal2 is demanding, observed SFSs were projected to five individuals per simulated population.A mutation rate of 1.01 × 10 − 8 per bp per year was assumed for estimating all parameters with a generation time of 50 years [67].To select the best model, we first used the Akaike information criterion (AIC), which represents the difference between the observed and expected SFSs, to rank the optimized models.Second, the parametric-bootstrap approach was used to calculate 100 likelihoods and the 95% CI for the parameters estimated by the best models and to select the best-fitting models according to the lowest bootstrapping AIC.

Environmental variables and GIS processing
Three categories of environmental variables were collected: 105 climate variables, eight soil variables, and four topological variables (Table S2).The climate variables included BIO01-19, seven monthly precipitation factors, temperature, solar radiation, and wind speed downloaded from WorldClim2 [69] and two aridity-associated layers [70].Eight physicochemical soil characteristics downloaded from SoilGrids 2.0 were used as the soil variables [71].For topography, the altitude layer was downloaded from GOV.DATA.TW (https://data.gov.tw/), and three altitude-derived layers (i.e., roughness, aspect, and slope) were computed with the R package raster [72].All variables were resampled to a resolution of 1 km 2 for the downstream analyses.To address the issue of multicollinearity, variables with VIF > 10 and Pearson correlation coefficient |r| >0.8 were discarded, leaving 20 environmental variables among the three categories (Table S3; Supplementary data) for use in the subsequent analyses.We compared differences in the environmental variables between groups using analysis of variance (ANOVA) and Tukey's honestly significant difference test applied in the R package stats [73].The boundaries of Taiwan and nearby islands were downloaded from GADM database (http://www.gadm.org/).All maps in this study were depicted by the authors using R 4.2.3 [73].

Niche differentiation assessment
To harness the power of GEA analysis to evaluate a species' ability to cope with future climate change, it is crucial to include non-genetic factors that may be associated with fitness to local environments, such as niche characteristics and phenotypic traits.To investigate and quantify the niche overlap and diversification between the three (eastern, western, and southern) genetic groups revealed in our results, we first assessed the overlap of the realized niche using the first two axes from the environmental PCA performed with the 20 retained environmental variables in the R package ecospat [74].Five hundred pseudo-absence points were added to the analyses as background.In addition, we calculated two niche overlap indices: Schoener's D and the standardized Hellinger-transformed Warren's I based on occurrence density grids.Next, we conducted the equivalency test with 100 permutations and the background test with 100 runs to evaluate the significance of niche differentiation at p < 0.05.The equivalency test and background test were implemented with the R package ENMtools [75].

Phenotypic trait analysis
To quantify the contributions of genetic and environmental factors to phenotypic variations, we measured 12 morphological leaf traits that are commonly used in morphological studies of Fagaceae and other plants (Table S3; Fig. S2; Supplementary data) [76][77][78].A total of 826 dried leaves of 191 individuals from 26 populations (1-5 leaves per individual) were recorded with a digital camera and measured using ImageJ [79].Because the measured traits may be highly correlated with leaf size, an additional six computational traits that are not typically correlated with leaf size were added [78].We first performed PCA and PERMANOVA with the six computational traits (i.e., shape-related traits and specific leaf area (SLA)) and abaxial surface color to quantify morphological differences among groups using the R package vegan [62].Biplots of the attributes and contributions were visualized with the R package factoextra [80].Differences in traits between groups were compared using ANOVA and Tukey's post hoc test.Next, we performed partial redundancy analysis (RDA) to evaluate the contributions of geography, demography, and environment to phenotypes using all scaled phenotypic traits as the response.
To further investigate the associations between morphological variables and environmental predictors, we constructed univariate generalized linear models (GLMs) with one of the leaf traits as the response (Table S4) and one of the environmental variables as the predictor in the R package MASS [81].

Detection of local adaptation and underlying gene functions Identification of environment-associated genetic variants
We integrated two F ST -based outlier detection methods and one genotype-environment association (GEA) approach to detect environment-associated SNPs.First, we implemented BayeScan [82] and pcadapt [83] to identify SNPs with a significant departure of allele frequency from neutrality while controlling for background genetic structure.BayeScan was run with posterior odds (PO) = 100, and SNPs with q < 0.01 were retained as putative outliers.Pcadapt was performed with the number of K principal components (PCs) selected by the decreasing order of the percentage of variation explained by each PC.We further implemented Benjamini-Hochberg correction [84] on the results and retained the SNPs with a false-discovery rate (FDR) < 10% as candidate outliers.
Second, we utilized a univariate approach, the latent factor mixed models (LFMM) algorithm [85,86], to discover SNPs that were significantly correlated with environmental variables while accounting for population genetic structure.Imputation of missing data and determination of latent factors (K) were performed by the snmf function in the R package LEA [85].LFMM was performed in the R package LEA [85] using the retained environmental variables with ten runs, and Z-scores from each run were combined.To further adjust the p-values, the Benjamini-Hochberg procedure [84] was used, and FDR < 10% was used to identify putative adaptive outliers.SNPs that overlapped between significant environmentassociated outliers and the BayeScan or pcadapt outliers were regarded as putative adaptive SNPs for use in downstream analyses.
To investigate and compare the roles of geography, demography, and environment in shaping the genetic variation of adaptive and all SNPs, we decomposed the relative contributions of each group of predictors using three methods.First, Mantel tests were used to test for associations between F ST /(1-F ST ) and geographic distance (isolation by distance, IBD) and environmental distance (isolation by environment, IBE) using the R package vegan [62].Environmental distances were represented by the Euclidean distances of scaled climatic, soil, and topographical variables.Values of pairwise F ST between populations were calculated with the R package hierfstat [61].Differences in F ST between adaptive and all SNPs were compared by ANOVA in the R package stats [73].
Second, we performed partial redundancy analysis (RDA) using three predictor datasets: (1) proxies of geographic structure obtained by converting the geographic coordinates to uncorrelated axes using principal coordinates of neighborhood matrices (PCNM) in the R package vegan [62] and retaining half of the positive axes; (2) proxies of demographic history obtained by performing PCA using the genetic matrix of all SNPs and retaining the PC scores from PC1 to PC2; and (3) the 20 retained environmental variables.The genetic metrics calculated from the adaptive or all SNPs were used as the response variables.To further quantify the variation explained by each group of environmental variables, we performed another partial RDA using the 20 environmental variables, which were classified into (1) climatic, (2) soil, and (3) topographical variables, as predictors of the two genetic metrics as the response.The significance of the full models and pure fractions was assessed using 999 permutations with the function anova.cca() of the R package vegan [62].
Third, we used GradientForest (GF), a non-linear and machine-learning approach derived from the random forest algorithm [87].GF was performed with the genetic metrics of the adaptive or all SNPs.The number of regression trees was set to 500, the maximum number of splits was set using the formula log 2 (0.368 × number of individuals/2), and the correlation threshold r was set to 0.5, as suggested in a previous study [87].The importance of each group of predictors was evaluated by the percentage of weighted R 2 .

Gene annotation and enrichment analysis
To further investigate the potential gene functions of putative environment-associated SNPs, we retrieved the closest genes with a physical distance < 10 Kbps as potential underlying genes using BEDOPS [88].The closest genes were further annotated to the Arabidopsis thaliana genome using KOBAS-I [89] with the criterion of e-value < 10 − 5 .After retrieving the homologous genes, gene enrichment analysis was performed with the GO and KEGG databases to detect enriched pathways using KOBAS-I.Significance was assessed with Fisher's exact test [90], and FDR was corrected with the Benjamini and Hochberg procedure [84].FDR < 5% indicated significantly enriched GO terms or pathways.

Investigating the influences of current landscape variables on genetic differentiation
To compare landscape characteristics that shaped the differentiation of adaptive or all SNPs, we constructed IBD, IBE, and isolation by resistance (IBR) models using F ST /(1-F ST ) calculated from adaptive or all SNPs as a response.Six predictor matrices were generated, including a geographical Euclidean distance metric (IBD), three environmental Euclidean distances calculated from the climate, soil, and topology factors (IBE climate , IBE soil , and IBE topology ), and two circuit-theory-based resistance layers calculated with CIRCUITSCAPE [91] using distribution maps of ecological niche modeling from the factors of climate and topography (e.g., IBR climate , IBR topography ).
We used two complementary approaches to compare the models.First, reciprocal causal modeling (RCM) [92] was performed to evaluate the relative importance of each model based on the relative values of partial Mantel tests estimated from each focal and alternative model.All partial Mantel tests were computed using the R package vegan [62] with 999 permutations.Second, we implemented the maximum likelihood population mixed-effects model (MLPE) [93] to compare each predictor based on the AIC values.Models with ΔAIC > 2 were considered to have different model fits.MLPE was implemented with mlpe_rga in the R package Resis-tanceGA [94].In addition, we performed an estimated effective migration surfaces (EEMS) analysis [95] to visualize geographic regions deviating from the assumption of IBD.Missing data were imputed and converted to genetic distance using the function bed2diffs.Next, EEMS was executed with RunEEMS_SNPS with a Deme number of 500.A total of 500,000 Markov chain Monte Carlo (MCMC) iterations were run after a burn-in of 150,000 and a thinning interval of 9,999.

Investigating the impacts of future climate change on genetic adaptedness Ecological niche modeling
To investigate the current potential distribution range of Q. longinux, we performed ecological niche modeling (ENM) using an ensemble approach based on the true skill statistic (TSS)-weighted combination of six methods in the R package sdm: maxent, glm, svm, gam, mda, and mlp [96].Five hundred pseudo-absence data were generated and added to the analyses.A total of 60 runs with 10-fold cross-validation analyses were performed.We then used the area under the receiver-operator curve (AUC) to evaluate model performance.Finally, the habitat prediction was transformed into a binary map that classified suitable and unsuitable regions using the threshold of maximum test sensitivity plus specificity.

Genetic offset assessment
Because the accessible prediction layers are limited, we only applied five temperature and precipitation variables (i.e., BIO1, BIO3, BIO12, PREC4, and PREC10) selected from the obtained factors to assess the influence of climate change on Q. longinux.We downloaded and averaged the predictions from three future models (CCSM4, MIROC-ESM, and MIROC5) in 2070 to account for model variability.We also considered two contrasting representative concentration pathways (RCPs): a lowemission model (RCP2.6)and a high-emission model (RCP8.5)from CMIP5 for 2070.Three complementary approaches were used to evaluate genetic offset.We first calculated the risk of non-adaptedness (RONA), which represents the theoretical changes needed in allele frequency to maintain the linear environment-genotype relationships with correction on weighted R 2 , implemented in pyRONA [97].We retained and discussed the top three representative climatic variables with the highest number of significant outliers.
Second, as a complementary method to RONA, we used a random forest algorithm to model the non-linear relationships between adaptive SNPs and the same climatic variables as RONA in the R package GradientForest [87].PCA was performed on the GF model to visualize the prediction of genetic variation in spatial regions, and the first three principal components (PCs) were assigned to a red-green-blue palette.The genetic offset in the GF model was calculated as the Euclidian distance between the current and future genetic composition at each grid.The binary map generated by ENM was used as a mask to limit prediction within suitable habitats.
Third, following Gougherty, Keller [98], we integrated migration to predict maladaptation to future climate using a generalized dissimilarity modeling (GDM) algorithm [2] with local, forward, and reverse genetic offsets.Local offset represents the predicted in situ change in allele frequencies with no migration.Forward offset was calculated by selecting the minimum offset between each grid in the current range and all grids in the future climate, and reverse offset was calculated by identifying the minimum offset between each grid within the current range in the future and all grids in the current climate.For forward offset, the distance required to migrate to the place with minimum forward genetic offset and the direction (bearing) of migration were also recorded.No dispersal limitation was assumed for forward and reverse offsets to any locations in suitable habitats.Finally, to simultaneously visualize local, forward, and reverse offsets, we mapped the three metrics as red, green, and blue in an RGB image.

Population structure and demographic history
On average, 70.9% of reads per sample were successfully mapped to the reference genome of Q. robur.We recovered 624,451 SNPs and retained 1,933 high-quality SNPs with an average individual missing rate of 11%.StrAuto revealed three genetic clusters (Fig. 1a): the eastern and western clusters, which are mainly separated by the central mountain range (CMR), and a cluster in the Henchung Peninsula that was limited to southern Taiwan, denoted as HC.From the result of StrAuto, individuals in each cluster were genetically admixed with other clusters (Fig. 1b).PCA and the neighbor-joining (NJ) tree also suggested that the boundaries of the three clusters were not clearly defined (Fig. 1c; Fig. S3a-b).F ST was higher between genetic clusters than within groups, and divergence was highest between HC and the eastern cluster (Fig. S4c).Similar results were obtained by PER-MANOVA (Table S5).The results of the estimated effective migration surface (EEMS) analysis also suggested the connectivity and geographic distribution of genetic diversity were mainly separated by CMR (Fig. 1d-e).According to the stairway plot (Fig. S4d), Ne has differed between HC and the other two clusters since 1 million years ago (Fig. S4d).The apparent population decline of HC 1 million years ago was preceded by a period of larger population size.By contrast, the eastern and western clusters underwent a relatively recent population decline in the last 100 Ky.The fastsimcoal2 analysis suggested no apparent gene flow between group HC and the eastern and western clusters since their divergence of 19.6 Kya (Table S6).Recent bottleneck events were also revealed at 11.5 Kya from the best models for all simulated populations.

Influence of landscape factors on population genetic divergence
We identified 182 F ST outlier SNPs (pcadapt + BayeScan) and 540 SNPs that were significantly associated with one or more environmental variables by LFMM (Fig. 2).A total of 105 putative adaptive SNPs (identified by the overlap between LFMM and F ST outliers) were found in 35 genes (Table S7) that were widely distributed across the genome and did not cluster in specific regions.F ST estimate from adaptive SNPs was higher and more significant than all SNPs (mean F ST all loci = 0.09; mean F ST outlier = 0.59; p < 0.05), indicating that selection may drive the spatial genetic differentiation between populations.Compared to all loci, the adaptive loci exhibited stronger IBD and IBE patterns (Fig. 3a).Using reciprocal causal modeling (RCM) and a maximum likelihood population mixed-effects model (MLPE), IBE was consistently identified as superior to other competitive models using adaptive SNPs when controlling population structure, indicating that genetic differentiation of the adaptive SNPs was mainly influenced by environmental variation (Fig. 3b-c; Table S8).Furthermore, the divergence of the genetic structure of HC from those of the other clusters was greater when divergence was assessed by adaptive SNPs than by neutral SNPs (Fig. S4a-b).By contrast, IBR based on topographical resistance was selected as the best model by MLPE and RCM using all sites, suggesting that the overall genetic differentiation of the species was mainly influenced by topographical dispersal barriers (Fig. 1d-e; Table S8).The conductance layers generated by the CIRCUITSCAPE algorithm also supported topographical resistance across regions, with greater barriers to dispersal at higher elevations (Fig. S4e-f ).
The results of partial RDA indicated that pure environment was the most significant variable affecting the genetic variation of adaptive SNPs among three pureeffect portions; whereas pure demography was the most important pure-effect portion of all SNPs when controlling other factors (Fig. S5; Table S9).Forward selection also identified population structure as the most  S10).Environmental variables contributed to 80% of the explained variation in adaptive loci but 60% in all loci (Fig. S5).For adaptive loci, the three groups of variables explained a large proportion of the joint effects (45% of explained variation) (Fig. S5; Table S9).Another partial RDA that decomposed the contributions of climate, soil, and topography to the genetic variation in either the adaptive or all SNPs revealed that pure climate and soil variables explained significantly more of the genetic variation in adaptive SNPs than topography variables (Fig. S5).In agreement with the results of partial RDA, GF analysis demonstrated that demography was the top variable with the highest weighted R 2 (Table S11; Fig. S6), although geography explained a higher proportion of variation after summing the total weighted R 2 (demography: 32%; geography: 47%).

Functional annotation of adapted loci
Several Q. longinux loci involved in climate and soil adaptation were associated with environmental variables (Table S7).Two functional pathways were significantly enriched in adaptive loci: oxidative phosphorylation and photosynthesis.
Furthermore, the allele frequencies of the annotated loci were associated with environmental gradients (Fig. 4a), indicating adaptation signals.Wind and soil gradients were significantly associated with the allele frequencies of ATPD, NF-Y19, cob, rpoB, and ABCG34 (Fig. 4b; Table S7).Some outliers, such as ADH1 orthologs, were associated with annual mean temperature and precipitation in spring (Fig. 4a).Other genes, including AT2G40435, the NET1D ortholog, and the HSP70 ortholog, were also involved in precipitation-and temperature-associated adaptation (Fig. 4b; Table S7).

Analyses of ecological niche and morphometric distinctiveness
Consistent with the genetic data, niche analysis of the first two axes of the environmental PCA and both overlap indices showed that the eastern and western clusters had a higher degree of niche overlap with each other than with HC (Fig. S7).HC was characterized by niches with higher mean annual temperature and lower precipitation in spring compared to those of the other clusters (Fig. S7).The equivalency test, background test, and ANOVA demonstrated that the three clusters occupied significantly different ecological niches (Fig. S7; Fig. S8).The PCA and RDA of phenotypic traits were also congruent with the genetic analyses and showed that HC was morphologically and ecologically differentiated from the other clusters (Fig. 5a-b; Fig. S9; Fig. S10), which was classified as Q. longinux var.kuoi.Although the eastern and southern clusters had a high degree of overlap according to the first two PC axes, PERMANOVA indicated significant differentiation between the two clusters (Table S12).In contrast, Q. longinux var.lativiolaciifolia was genetically and morphologically admixed with Q. longinux var.longinux, and no apparent structure could be assigned between these two varieties (Fig. 1a-b).Partial RDA using all phenotypic traits as responses demonstrated that pure geography contributed more variation than pure environment, and a large intersection between their interactions was found (Fig. 5c-d; Table S13).The GLMs revealed that leaf traits were significantly associated with environment, but the directions of the relationships (i.e., positive or negative) differed depending on the trait and environmental variable (Fig. S11).For example, leaf thickness was positively correlated with annual temperature (r = 0.22, p < 0.01), whereas leaf length was negatively correlated with annual precipitation (r = − 0.22, p < 0.01).

Genetic offset and prediction of the response to future climate change
The high AUC value (average AUC = 0.81) suggested a good model fit for the predicted distribution of Q. longinux (Table S14; Fig. S12).As the predictions under different RCPs were highly correlated (Table S15), we inferred RONA based on the average values from both predictions.Substantial variations in the RONA estimates between populations and the top three representative environmental variables were observed (Fig. 6ac; Table S15).The RONA estimates were larger in regions with greater differences between current and future climates (Fig. 6a-c).Whereas the eastern and western populations had relatively low RONA values (< 0.2) for the three variables, the northern populations were predicted to suffer from severe winter rainfall (precipitation in October, Fig. 6a) in the future and had much higher RONA values (> 0.6).
The GF model constructed with five climatic variables suggested that precipitation in winter was the most influential variable with the highest weighted R 2 (Fig. S13).The focal species exhibited strong spatial patterns, indicating adaptation to local climate conditions (Fig. 6d).Consistent with the results of the RONA analysis, the GF model estimated highly correlated results between RCPs with similar genetic offset patterns (Fig. 6e-f ).Both the RONA analysis and the GF model indicated that the northern populations were the most vulnerable to future climate change (Fig. 6e-f ).
Although the predicted patterns of local, forward, and reverse offsets varied throughout the range of Q. longinux, these offsets were consistently predicted to be highest in the northernmost and southeastern populations (Fig. 7a-b).More migration events were predicted for the northern populations, with longer distances to minimize forward offsets in the RCP8.5 model than in the RCP2.6 model (Fig. 7c-f ).

Discussion
We analyzed genomic data and phenotypic traits to explore the genetic architecture of several environmentassociated adaptations of Q. longinux, a dominant evergreen forest tree species on the subtropical island of Taiwan in East Asia.We identified several SNPs with strong effects on adaptation to environmental factors, including some factors that have rarely been discussed in GEA studies (e.g., soil-and wind-related factors).We found that leaf traits were influenced by the interaction of demographic and environmental factors.Moreover, we determined that the populations in northern and southeastern Taiwan are the most vulnerable to future climate change.Finally, we identified populations with unique genetic and phenotypic characteristics in southern Taiwan.These populations are potential targets for conservation efforts in forest management to preserve unique and adaptive genetic resources.

Distinct genetic separation of southern populations from eastern and western populations in Taiwan
Three putative genetic clusters were classified using PCA and StrAuto.The eastern and western clusters were mainly separated from each other by the CMR and were mixed in northern and southern Taiwan.Similar patterns of east-west divergence have been observed for other plants in Taiwan [99,100], implying that the mountain ridges and rugged topography act as profound barriers to gene flow and contribute to the divergence of species occupying low-to-middle elevations in Taiwan.The third cluster, HC, was limited to the Henchung Peninsula in southern Taiwan, which has been identified as the main glacial refuge in Taiwan for other Fagaceae species, such as Q. glauca and Castanopsis carlesii [101,102].The populations of HC were genetically differentiated from other populations, as suggested by higher pairwise F ST and distinct trajectory of changes in Ne in the past 1,000 years.The optimal demographic scenario revealed HC diverged from the rest of the populations dating back to the Last Glacial Maximum (LGM; ∼20Kya) when most of the low-elevation oaks were believed to retreat to the refuge in southern Taiwan [101,102].Also, implying colonization resulting from Pleistocene climate oscillations could prompted the diversification and local adaptation of plants in Taiwan Island.Taken together, our results reveal significant genetic differentiation between the eastern cluster, western cluster, and HC and suggest that HC, morphologically classified as Q. longinux var.kuoi, in the southernmost part of Taiwan have diverged from A distinct group of Q. longinux populations, group HC, with unique leaf and fruit traits in tropical marine climates in southern Taiwan was previously classified as Q. longinux var.kuoi.This variety has no whitish epicuticular wax coating on abaxial surfaces [48].Niche analyses also demonstrated that the habitats harboring Q. longinux var.kuoi were significantly different from those of the other populations, with no ecological overlaps, indicating that Q. longinux var.kuoi may face different environmental pressures.Moreover, we found evidence of adaptive divergence between Q. longinux var.kuoi and other populations.For example, the low spring precipitation and high clay content in habitats in southeastern Taiwan may act as strong environmental stresses that initiate genetic and phenotypic adaptation (e.g., drought resistance) in response to local conditions.In plant species, hostile environmental conditions in edge populations prompt local adaptation processes [103].Distinct environmental pressure, along with a lack of migration between HC and the rest of the Q. longinux population, could speed up the fixation of alleles in a relatively smaller group, further facilitating the genetic and phenotypic adaptation [25,104].The substantial divergence, relatively high genetic diversity, and high offsets of the populations in southern Taiwan indicate that Q. longinux var.kuoi is a conservation unit that should be prioritized for protection as a source of adaptive genetic variations related to high temperature and drought resistance under climate change.

Environmental heterogeneity drives adaptive genetic divergence and phenotypic variation
IBE was the most strongly supported model based on putative adaptive loci, whereas IBR mainly drove genetic differentiation based on all SNPs.PCA also revealed less genetic admixture when genetic differentiation was assessed by GEA outliers compared to neutral SNPs, indicating that the three genetic clusters (i.e., eastern, western, and HC) were exposed to different environmental pressures and had undergone adaptive divergence.However, the partial RDA revealed a large intersection of explained variation (45%) shared by environment, geography, and colonization history, suggesting that environmental variation is highly covaried with other confounding factors.Similar covariations have been observed in the south-to-north postglacial expansion of the red spruce along the Appalachian Mountains which created high collinearity between genetic structure, climate gradients, and geographic distributions [105].Similarly, evergreen subtropical trees in Taiwan underwent south-north expansions after the LGM and may have developed adaptations to latitudinal gradients of temperature and precipitation, resulting in confounding relationships between geography, genetic structure, and adaptation.Consequently, it was challenging to attribute and disentangle the genetic variation explained by each category of predictors, leading to a non-significant impact of pure climate variables.
Leaf shape is affected by various environmental factors, such as precipitation and temperature, which maximize photosynthetic efficiency and adaptation to harsh environments [76,[106][107][108].We found that several leaf traits of Q. longinux were influenced by environmental factors (Fig. S11; Table S13) and a negative correlation between leaf length and annual precipitation contradicts previous findings [51,109].However, we observed a positive correlation between annual precipitation and wind speed in winter and a negative correlation between annual precipitation and solar radiation in summer (Fig. S14).Strong wind and weaker solar radiation may counteract the effects of increased annual precipitation on leaf length.Similar confounding effects of climatic factors on leaf growth and elongation were observed in Fagus sylvatica [77].We demonstrated that the interaction of environment and geographic relationships mainly contributed to the explained variation in leaf traits.Demographic history provided only a limited contribution to leaf variation, suggesting that phenotypic plasticity or local adaptation contributed by local climate surpasses the impact of demographic history on leaf traits.
Temperature and precipitation are known drivers of genetic adaptation in Fagaceae [34].This study expands the environmental factors considered compared to previous GEA studies and demonstrates that significant selection is initiated by multiple environmental factors in an endemic Quercus species.First, we found that wind speed in cold seasons influenced leaf traits and adaptive genetic variation (Table S10; Fig. S11).Wind intensity has been shown to reduce plant growth and height and increase stem thickness [110,111].Wind also influences transpiration rates and stomatal conductance, indirectly affecting photosynthetic efficiency and water requirements [112,113].Some genes correlated with wind speed were also significantly associated with precipitation-related factors (e.g., ATPD, NF-YCP), implying that elevated evaporation rates caused by strong wind may result in drought-like stress, which plants may respond to through similar genetic pathways.Second, we determined that soil-related variables contributed 60% of the variation in adaptive divergence, indicating critical roles of these variables in local adaptation of Q. longinux.The characteristics of soil particle sizes represent the potential water content and salt stress in local soils.In general, soil particle size is negatively correlated with water availability, implying potential abiotic pressure from water deprivation during dry seasons [36,114].Consistent with this conclusion, we observed a correlation between genes involved in the response to drought and grid size.In summary, our findings provide a new perspective for future GEA studies by indicating that some environmental variables with important but rarely tested physiological impacts can be used to unravel the intricate mechanisms related to plant adaptation.
We identified two significantly enriched functional pathways, oxidative phosphorylation and photosynthesis, which have been implicated in plant adaptation and physiological responses to abiotic stress [115,116].Oxidative phosphorylation helps regulate reactive oxygen species (ROS) generation by plant mitochondria under abiotic stresses [115,116].The efficiency and regulation of photosynthesis strengthen the plant and sustain its growth and development under stressful or unfavorable conditions [117].These results suggest that the identified adaptive SNPs underlie the response to different abiotic stresses.Several genes demonstrating significant associations with environments also have potential functions for adaptation.For example, loci of ADH1 are strongly associated with annual mean temperature and precipitation in spring.ADH1 is responsive to multiple abiotic stimuli, including low temperature, hypoxia, flooding, salt, and dehydration [118][119][120].In legumes, ADH1 is a target of miRNA regulation under water-deficit to coordinate ROS levels [121].Under stressful cold situations, ADH1 plays a crucial role in maintaining the stability of membrane structure to enhance cold resistance in plants [115].Other genes involved in precipitation-and temperatureassociated adaptation include orthologs of AT2G40435, which is involved in responses to biotic and abiotic stresses [122,123]; NET1D, which is expressed in root tissues and mediates uptake in response to stress [124]; and HSP70, which stabilizes and refolds heat-inactivated proteins to protect cells from heat damage [125].

Quercus longinux
The projections of genetic offset from GF and RONA revealed that the populations in northern Taiwan might experience the most considerable turnover of genetic composition to cope with future climate change (Fig. 6).Winter precipitation in northern Taiwan is expected to more than double according to both emission models (current: 302 mm, RCP2.6: 682 mm, RCP8.5: 635 mm).The drastic increase in winter precipitation will negatively impact forest productivity [126] because the complex relationship between precipitation and water availability affects plant growth and phenology [126,127].Winter precipitation significantly impacts the phenology of oaks, including the onset and duration of flowering, bud bursting, and leaf flushing, and thus may greatly affect the likelihood and extent of gene flow between populations [128].Considering the long generation time of oaks and the difficulty of juvenile growth in occupied forests, the expected changes in allele frequency to adapt to increased winter precipitation in the northern populations (RONA > 0.6) may not be achievable through standing genetic variation alone.
Under the emission model of intensified global warming (RCP8.5),southward migration over long distances (> 200 km) will increase to minimize forward genetic offset.Although we accounted for migration in estimating genetic offset, the northernmost and southeastern populations consistently showed relatively high local, forward, and reverse offsets.Our results indicate that no current populations across the distribution of Q. longinux are preadapted to future climates in these regions.The northern and southeastern populations are crucial genetic sources for climatic adaptation in other regions and should be prioritized in conservation strategies and protection efforts.Moreover, the rugged topography in mountainous regions may further hamper the movement of populations to higher elevations.Assisted gene flow from other populations preadapted to future climates may help marginal populations mitigate the effects of climate change [129].For example, southern populations (e.g., HC) may act as potential sources of adaptation to high temperatures and seasonal arid climates for populations at higher altitudes and latitudes where higher future temperatures are predicted.However, the source populations must be selected carefully to ensure genetic compatibility with the sink populations and new environments [129].

Fig. 1
Fig. 1 Spatial genetic structure and diversity of Q. longinux in Taiwan.(a) Sampling populations annotated with ancestral components inferred by StrAuto according to K = 3.(b) Results of StrAuto from K = 2 to 3. The height of each colored segment represents the possible ancestry of each individual derived from inferred ancestors.(c) Results of PCA with different colors reflecting different groups.(d) Results of the EEMS analysis of regions significantly deviating from isolation by distance (IBD).The blue, white, and orange colors illustrate regions with high dispersal rates (dispersal corridor), IBD, and low dispersal rates (dispersal barrier), respectively.(e) Results of the effective diversity (i.e., the modeled dissimilarity between pairs of individuals from the same location) estimated by the EEMS analysis.Blue and orange represent regions of high and low genetic diversity, respectively.The black dots indicate the sampling locations

Fig. 2
Fig. 2 Results of the genetic scans.The Venn diagram shows the overlap of the loci identified by each method

Fig. 3
Fig. 3 Landscape genetics of Q. longinux and factors influencing genetic divergence.(a) Isolation by distance (IBD) analysis and three isolation by environment (IBE) analyses based on adaptive sites (green dots and upper lines) or all SNP sites (red dots and lower lines).(b) Results of reciprocal causal modeling using all SNP sites or (c) adaptive loci.The y-axis represents focal models, and the x-axis represents alternative models.The colors of the heatmap indicate differences in Mantel's R between the focal and alternative models.Fully supported focal models are marked with asterisks

Fig. 4
Fig. 4 Results of environment-dependent outlier detection.(a) Associations between allele frequencies and environmental variables.The black dots represent the average allele frequencies of all adaptive loci correlated with WIND10, PREC04, CLYPPT, and BIO01 for each population.(b) Manhattan plots of SNP sites associated with WIND10, PREC04, CLYPPT, and BIO01.The orange dots represent significant loci at p < 0.05.Selected candidate genes are labeled in the plots at their respective genomic positions

Fig. 5
Fig. 5 Variations in leaf traits and results of partial RDA.(a) Results of PCA colored by different genetic groups.(b) Partial RDA partitioning sources of phenotypic variation into different environmental factors.Environmental factors are depicted with red arrows.(c) Partial RDA of environmental factors, geography, and demography.(d) Partial RDA partitioning explained variation into climatic, soil, and topographical variables.The values are the explained variation.Values < 0 are not shown

Fig. 6
Fig. 6 Prediction of genetic offset under future climate change based on (a, b, c) RONA and (d, e, f ) GF methods.(a), (b), and (c) reflect the RONA values estimated for PREC10, BIO12, and BIO03.The color gradients represent the average differences in climatic variables between current conditions and 2070 climate change scenarios.(d) RGB map of the first three PC axes based on the GF prediction, which depicts the genetic turnover of adaptive loci.(e) and (f ) illustrate the genetic offset throughout the range of Q. longinux under the RCP2.6 (c) and RCP8.5 (d) scenarios in 2070.The colors of the cells represent the values of genetic offset estimated based on the GF procedure

Fig. 7
Fig. 7 Map of the GDM-predicted genetic offset across the distribution of Q. longinux under the RCP2.6 (a, c, e) and RCP8.5 (b, d, f ) climate change scenarios.(a) and (b) show the RGB maps of local (red), forward (green), and reverse (blue) offsets under RCP2.6(a) and RCP8.5 (b) in 2070.Brighter cells represent relatively high values along each of the three axes, whereas darker cells indicate relatively low values.(c) and (d) depict the direction of forward offsets among the distribution range of Q. longinux under RCP2.6 (c) and RCP8.5 (d) in 2070.(e) and (f ) represent the estimated distances between source cells with the lowest forward offset to sink cells under RCP2.6 (e) and RCP8.5 (f ) in 2070